Abstract — Distributed computing architectures utilize a set 
of computational elements (CEs) to achieve performance that 
is not attainable on a single CE. Conventional load balancers 
have proven effective in increasing the utilization of CPU, 
memory, and disk I/O resources in a distributed environment. 
However, most existing load-balancing schemes ignore 
network resources, leaving an opportunity to improve network 
characteristics such as delay or effective bandwidth for 
networks running parallel applications. Load balancing 
becomes more challenging in interactive applications because 
load variation is very large and the load on each server may 
change dramatically over time. By the time a server makes 
its load migration decision, the load status collected from 
other servers may no longer be valid. This affects the 
accuracy, and hence the performance, of load balancing 
algorithms. Existing methods neglect the effect of 
network delay among the servers on the accuracy of the load 
balancing solutions. This paper discusses how network delay, 
combined with changes in server load, affects the performance 
of load balancing algorithms. 

Index Terms — Load balancing, Delay, CPU Resources, I/O 
resources, Computing Elements, Tasks 

I. Introduction 

The performance of LB in delay-infested environments 
depends upon the selection of balancing instants as well as 
the level of load-exchange allowed between nodes. A number 
of load-balancing schemes have been developed, primarily 
considering a variety of resources, including the CPU, 
memory, disk I/O, or a combination of CPU and memory 
resources. These approaches have proven effective in 
increasing the utilization of resources, assuming that network 
interconnects are not potential bottlenecks [2]. For example, if 
nodes have inaccurate information about the state of other 
nodes, due to random communication delays between nodes, 
then this could result in unnecessary periodic exchange of 
loads among them. Consequently, certain nodes may become 
idle while loads are in transit, a condition that would result in 
prolonging the total completion time of a load. In general, 
load-balancing techniques fall into two categories. 

A. Centralized Load Balancing 

In centralized methods, a central server makes the load 
migration decisions based on the information collected from 
all local servers and then passes the decisions to local servers 
for them to carry out the load migrations. 

B. Decentralized Load Balancing 

In decentralized methods, each local server makes 
load migration decisions with information collected from its 
neighbour servers. Decentralized methods are more efficient 
than centralized methods as they perform the load balancing 
process by considering only the local load information. For 
interactive applications, the load on each server may change 
dramatically over time, so by the time a server makes 
its load migration decision, the load status collected from 
other servers may no longer be valid [1]. 

Load balancing algorithms are of two types [10]: 

C. Static Load Balancing 

In this method, allocation is made when the process is 
created and cannot be changed during process execution to 
respond to changes in the system load. 

D. Dynamic Load Balancing 

In this method, the workload is distributed among the nodes at 
runtime. Unlike static algorithms, dynamic algorithms allocate 
processes dynamically when one of the processors becomes 
underloaded. 
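
The difference between the two policies can be illustrated with a minimal sketch. The function names and the "least-loaded node" rule are assumptions made for illustration only; the cited work [10] does not prescribe them.

```python
from typing import List


def static_assign(num_tasks: int, num_nodes: int) -> List[int]:
    """Static policy: task k is bound to node k % num_nodes at creation time
    and is never moved afterwards, whatever the later load looks like."""
    return [k % num_nodes for k in range(num_tasks)]


def dynamic_assign(task_costs: List[float], num_nodes: int) -> List[int]:
    """Dynamic policy (assumed rule): each task is placed at runtime on the
    node that is currently the least loaded."""
    loads = [0.0] * num_nodes
    placement = []
    for cost in task_costs:
        # Pick the node with the smallest accumulated load so far.
        node = min(range(num_nodes), key=lambda n: loads[n])
        loads[node] += cost
        placement.append(node)
    return placement
```

With one expensive task followed by several cheap ones, the static policy keeps alternating between nodes, while the dynamic policy steers the cheap tasks away from the node that holds the expensive one.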

II. Literature Survey 

Much research has focused on the issue of 
distributed load balancing for CPU and memory resources. 
Harchol-Balter and Downey [5] developed a CPU-based 
pre-emptive migration policy which was shown to be more effective 
than non-pre-emptive migration policies. Zhang et al. [6] 
studied load-sharing policies which consider both CPU and 
memory services among the nodes. An I/O-aware load 
balancing scheme was proposed by Xiao Qin to meet the needs 
of a cluster system with a variety of workload conditions 
[7]. The above approaches do not consider the balancing of 
communication load. A communication-sensitive load 
balancer was proposed by Cruz and Park [8]. The 
communication-aware load balancing scheme attempts to simultaneously 
balance two different kinds of I/O load, namely, 
communication and disk I/O [2]. Different load balancing 
techniques have been proposed, differentiated by whether they 
follow the centralized or the decentralized approach, as well 
as techniques that take delays into account. In the decentralized 
approach, local servers, i.e., servers managing a group 
of nodes, perform the load balancing process individually. 
Each server determines the amount of load to be transferred 
to its neighbour servers. 

The traditional load balancing application domains that 
consider delay in the design of load balancing are adaptive 
mesh analysis and queuing analysis. In adaptive mesh 
analysis, [4] proposes a dynamic load balancing scheme. The 
method is divided into global and local load balancing 
processes, with the objective of minimizing remote 
communications by reducing the number of load balancing 
processes among distant processors. 

In queuing analysis, [9] conducts extensive analysis and 
reveals that computational delays and load-transfer delays 
can significantly degrade the performance of load balancing 
algorithms that do not account for any delays. A further 
extension of this work considers the random 
arrivals of external tasks. Both try to minimize the 
task completion time in the presence of network and/or computational 
delays. In contrast, [1] argues that when the local 
servers have received the load balancing solutions from the 
central server after some network delay, the loads of the local 
servers may be very different and the load balancing solutions 
may no longer be accurate. 

III. System Study 

A. Communication-Aware Load Balancing for Parallel 
Applications on Clusters 

In [2], an approach at the software level is proposed 
to achieve high effective bandwidth communication without 
requiring any additional hardware. COM-aware load balancing 
enables a cluster to utilize most idle, or underutilized, network 
resources while keeping the usage of other types of resources 
reasonably high. 

An application model is introduced, which aims at 
capturing the typical characteristics of the communication, 
disk I/O, and CPU activity within a parallel application. It is 
applicable for both communication and I/O intensive parallel 
applications. The execution time model for a parallel job 
running on a dedicated cluster environment can be derived 
as follows. Given a parallel job with p identical processes, the 
execution time of the job can be calculated as 



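$$T_p = s_p + \sum_{i=1}^{N} \max_{j=1}^{p} \left( T_{CPU}^{j,i} + T_{COM}^{j,i} + T_{Disk}^{j,i} \right) \qquad (01)$$

Where $T_{CPU}^{j,i}$, $T_{COM}^{j,i}$, and $T_{Disk}^{j,i}$ denote the execution time 
of process j in the i-th phase on the three respective resources. 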
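
As a worked illustration of (01), the sketch below reads the model as the s_p term plus, for every phase, the combined CPU, communication, and disk time of the slowest process in that phase. The function name, the data layout, and the treatment of s_p as a given constant are assumptions for illustration.

```python
from typing import List, Tuple

# One phase: a (cpu, com, disk) time triple for each of the p processes.
Phase = List[Tuple[float, float, float]]


def execution_time(s_p: float, phases: List[Phase]) -> float:
    """Evaluate the execution-time model: s_p plus the per-phase maximum of
    the summed CPU, communication, and disk times over all processes."""
    total = s_p
    for phase in phases:
        # A phase ends only when its slowest process has finished.
        total += max(cpu + com + disk for cpu, com, disk in phase)
    return total
```

For example, with s_p = 1.0 and two phases whose slowest processes need 5.0 and 1.5 time units, the job takes 7.5 time units.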
Communication-aware load-balancing scheme: 
A dynamic, communication-aware load-balancing scheme 
for non-dedicated clusters has been proposed. It first measures 
the communication load imposed by the processes. Let a 
parallel job formed by p processes be represented by $t_0, t_1, \ldots, t_{p-1}$, 
where $n_j$ is the node to which $t_j$ is assigned. Assume 
that $t_0$ is a master process and $t_j$ (0 < j < p) is a slave process. 
Let $L_{j,p,COM}$ denote the communication load induced by $t_j$, which 
can be computed with the following formula, where $T_{COM}^{j,i}$ is 
the same as the one used in (01): 

$$L_{j,p,COM} = \begin{cases} \sum_{i=1}^{N} T_{COM}^{j,i}, & j \neq 0,\ n_j \neq n_0 \\ 0, & j \neq 0,\ n_j = n_0 \\ \sum_{1 \le k < p,\ n_k \neq n_j} \sum_{i=1}^{N} T_{COM}^{k,i}, & j = 0 \end{cases} \qquad (02)$$



The communication load on node i, $L_{i,COM}$, is defined as follows: 



(03) 



To balance the communication load, processes need 
to select remote nodes. The following two criteria must be 
satisfied to avoid useless migrations. 

Let nodes i and j be the home node and candidate remote 
node for process t. 

Criterion 1: This criterion guarantees that the load 
on the home node will be effectively reduced without making 
other nodes overloaded. This can be formally expressed as: 

$$(L_{i,COM} - L_{j,COM}) > L_{t,p,COM} \qquad (04)$$

Criterion 2: The estimated response time of t on node j is less 
than its local execution. If $R_t^i$ and $R_t^j$ are the estimated 
response times of t on nodes i and j, respectively, and $C_t^{i,j}$ is the 
migration cost for process t, then: 

$$R_t^j < R_t^i + C_t^{i,j} \qquad (05)$$
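
The first migration test can be sketched as follows. The aggregation of per-process loads into a node load (a plain sum over the processes placed on that node) is an assumption made for this example, as are the function names; only the Criterion 1 comparison itself is taken from the text.

```python
from typing import Dict, List


def node_com_load(process_loads: List[float], placement: List[int]) -> Dict[int, float]:
    """Assumed aggregation: a node's communication load is the sum of the
    communication loads of the processes currently placed on it."""
    loads: Dict[int, float] = {}
    for load, node in zip(process_loads, placement):
        loads[node] = loads.get(node, 0.0) + load
    return loads


def satisfies_criterion_1(home_load: float, remote_load: float, process_load: float) -> bool:
    """Criterion 1: the load gap between the home node and the candidate
    remote node must exceed the load the process would carry with it."""
    return (home_load - remote_load) > process_load
```

If the gap is not larger than the process's own load, moving the process would merely shift the imbalance to the remote node, which is the useless migration the criterion is meant to prevent.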



The experimental results show that the COM-aware 
approach can improve the performance by up to 206 and 235 
percent, in terms of slowdown and turn-around time, 
respectively, under high communication demands. 

B. On Delay Adjustment for Dynamic Load Balancing in 
Distributed Virtual Environments 

Distributed virtual environment (DVE) systems are highly interactive and the load on 
each server may change dramatically over time, so by the time 
a server makes its load migration decision, the 
load status collected from other servers may no longer be 
valid. Due to communication delays among servers, the load 
balancing process may be using outdated load information 
from local servers to compute the balancing flows, while the 
local servers may be using outdated balancing flows to 
conduct load migration. 

A dynamic load balancing method based on the centralized 
approach is proposed. The load distribution of the n servers 
of the DVE system is {$l_1, l_2, l_3, \ldots, l_n$}. To formulate the problem, a 
server graph is constructed as G = (S, E). The server graph is 
a weighted undirected graph. The weight associated with 
each node $S_i$ is its load $l_i$, and the weight 
associated with each edge $e_{i,j}$ is called the diffusion coefficient, 
denoted by $c_{i,j}$. The balancing flow can then be formulated as 
follows: 

$$\bar{l} = l_i - \sum_{j} \lambda_{i \to j} \qquad (06)$$
Where $\bar{l}$ is the average load over all nodes and $\lambda_{i \to j}$ is the balancing flow from $S_i$ to $S_j$. 

The global heat diffusion algorithm has two main processes: 

Global Migration Planning (GMP): It computes the 
balancing flows to solve the load balancing problem. This 
process is carried out by the central server. 

Local Load Transfer (LLT): It transfers load between 
adjacent servers based on the amount indicated by the balancing 
flows. This process is carried out by each local server 
that manages a partition. 
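
The planning step can be illustrated with a first-order diffusion iteration on the server graph. This is a generic textbook scheme used here as an assumption, not necessarily the exact procedure of the surveyed paper; the function name and the dictionary-based graph representation are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]  # (i, j): an edge of the server graph


def diffusion_flows(loads: List[float], coeff: Dict[Edge, float],
                    iterations: int) -> Dict[Edge, float]:
    """Accumulate balancing flows by repeatedly moving, along every edge,
    the diffusion coefficient times the current load difference."""
    current = list(loads)
    flows = {edge: 0.0 for edge in coeff}
    for _ in range(iterations):
        # Compute every transfer from the same load snapshot, then apply them.
        step = {(i, j): c * (current[i] - current[j]) for (i, j), c in coeff.items()}
        for (i, j), amount in step.items():
            current[i] -= amount
            current[j] += amount
            flows[(i, j)] += amount
    return flows
```

A positive accumulated value for edge (i, j) means load should move from server i to server j; a negative value means the opposite direction. Once the iteration has converged, each server's load minus its outgoing flows equals the average load.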

The Delay Adjustment Schemes 

1. Uniform Adjustment Scheme 

Due to the communication delay, the balancing flow received 
by $S_i$ is given by: 

(*) = x i-j i t* ~ & * i > t + A *f ) (° 7 ) 
Now, we adjust the balancing flow as follows: 

A^|Ct + At£ = A^Ct) + Al^t - % t + AtJ (08) 

Where $\lambda'_{i \to j}(t + \Delta t_i)$ is the adjusted balancing flow in $S_i$ at time $t + \Delta t_i$. 
This is only an approximation method, which assumes 
that the contribution to the net increase of $S_i$'s load is uniform 
among $S_i$'s neighbor servers. 

2. Adaptive Adjustment Scheme 

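The load variation of $S_i$ can be computed as: 

$$\Delta l_i(t - \Delta t_i, t + \Delta t_i) = \beta_{j \to i}(t - \Delta t_i, t + \Delta t_i) - \beta_{i \to j}(t - \Delta t_i, t + \Delta t_i) + \beta_{k \to i}(t - \Delta t_i, t + \Delta t_i) - \beta_{i \to k}(t - \Delta t_i, t + \Delta t_i) \qquad (09)$$

where $\beta_{j \to i}(t - \Delta t_i, t + \Delta t_i) - \beta_{i \to j}(t - \Delta t_i, t + \Delta t_i)$ and 
$\beta_{k \to i}(t - \Delta t_i, t + \Delta t_i) - \beta_{i \to k}(t - \Delta t_i, t + \Delta t_i)$ 
indicate the net increase in load of $S_i$ contributed by $S_j$ and $S_k$, 
respectively. Here, an adjustment scheme is introduced that 
does not require server-to-server communication. 
Thus, using the above equations, an adjusted balancing flow 
is given as: 

$$\lambda'_{i \to j}(t + \Delta t_i) = \lambda_{i \to j}(t) + \beta_{j \to i}(t - \Delta t_i, t + \Delta t_i) - \beta_{i \to j}(t - \Delta t_i, t + \Delta t_i) \qquad (10)$$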

Experimental results show that both adjustment schemes can 
greatly improve the performance of load balancing, with 
the adaptive adjustment scheme performing even better on 
average in the experiments. 
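
The two adjustment ideas can be contrasted in a small sketch. Both functions are illustrative assumptions: the uniform variant splits a server's net load increase evenly over its neighbours, which is one reading of the uniformity assumption stated above, and the adaptive variant corrects each flow by the net load that actually arrived from that particular neighbour during the delay window.

```python
def uniform_adjust(planned_flow: float, net_increase: float, num_neighbours: int) -> float:
    """Uniform scheme (assumed form): every neighbour is taken to have
    contributed an equal share of the server's net load increase."""
    return planned_flow + net_increase / num_neighbours


def adaptive_adjust(planned_flow: float, gained_from_neighbour: float,
                    sent_to_neighbour: float) -> float:
    """Adaptive scheme: correct the planned flow towards one neighbour by the
    net load that neighbour contributed while the plan was in transit."""
    return planned_flow + gained_from_neighbour - sent_to_neighbour
```

When most of the new load came from a single neighbour, the adaptive version sends it back along that edge instead of spreading the correction over all edges, which fits the reported advantage of the adaptive scheme.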

C. The Effect of Time Delays on the Stability of Load 
Balancing Algorithms for Parallel Computations 

The main objective is to analyse 
the effects of delays in the exchange of information among 
computational elements (CEs), and the constraints these 
effects impose on the design of a load balancing strategy. A 
deterministic dynamic nonlinear time-delay system is 
developed to model load balancing. The model is shown to 
be self-consistent in that the queue lengths cannot go 
negative [3] and that the total number of tasks in all the queues 
and the network is conserved (i.e., load balancing can neither 
create nor lose tasks). 
Methodology 

The computers are assigned an equal number of 
homogeneous tasks. Some of the nodes generate more tasks, 
and very quickly the loads on the various nodes become unequal. 
To balance the loads, each computer in the network sends its 
queue size $q_j(t)$ to all other computers in the network. A 
node i receives this information from node j delayed by a finite 
amount of time $\tau_{ij}$. 

Each node then uses this information to compute its 
estimate of the network average of the number of tasks in all 
n queues of the network. Based on the most recent 
observations, the local estimate of the network average is 
computed by the i-th node as $\frac{1}{n} \sum_{j=1}^{n} q_j(t - \tau_{ij})$. Node i then compares its queue 
size $q_i(t)$ with its estimate of the network average by 
estimating its excess load $q_i(t) - \frac{1}{n} \sum_{j=1}^{n} q_j(t - \tau_{ij})$. If its 
excess load is greater than zero or some positive threshold, 
the node sends some of its tasks to the other nodes. If it is 
less than zero, no tasks are sent. Further, the tasks sent by 
node i are received by node j with a delay $h_{ij}$. The load 
balancing algorithm decides how often to do load balancing 
and how many tasks are to be sent to each node. 
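
The delayed estimate described above can be sketched as follows. Queue histories are modelled as plain functions of time and a node's delay to itself is taken as zero; both are assumptions made for this example, and the function names are hypothetical.

```python
from typing import Callable, List

QueueHistory = Callable[[float], float]  # maps a time to the queue size at that time


def local_average(i: int, t: float, queues: List[QueueHistory],
                  tau: List[List[float]]) -> float:
    """Node i's estimate of the network average at time t, built from the
    most recent (delayed) queue sizes it has received from every node."""
    n = len(queues)
    return sum(queues[j](t - tau[i][j]) for j in range(n)) / n


def excess_load(i: int, t: float, queues: List[QueueHistory],
                tau: List[List[float]]) -> float:
    """Node i's own queue size minus its delayed estimate of the average."""
    return queues[i](t) - local_average(i, t, queues, tau)
```

A positive result means node i believes it is above the average and may send tasks away. Because the estimate uses stale values, two nodes can disagree about the average at the same instant.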
The mathematical model of the task load dynamics at a given 
computing node is given by 

$$\frac{d}{dt} x_i(t) = \lambda_i - \mu_i + u_i(t) - \sum_{j=1}^{n} p_{ij} \frac{t_{p_i}}{t_{p_j}} u_j(t - h_{ij}) \qquad (11)$$

Where $x_i(t)$ is the expected waiting time at the i-th node, 
$\lambda_i$ is the rate of generation of waiting time on the i-th 
node caused by the addition of tasks, 
$\mu_i$ is the rate of reduction in waiting time caused by 
the service of tasks at the i-th node, 
$u_i(t)$ is the rate of removal (transfer) of tasks from 
node i at time t by the load balancing algorithm at node i, and 
$p_{ij}$ is the fraction of the j-th node's tasks to be sent out that 
it sends to the i-th node. 

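The quantity $-p_{ij} u_j(t - h_{ij})$ is the rate of increase (rate 
of transfer) of the expected waiting time at node i due to tasks sent by node j. 
Node j compares the delayed waiting time of each of the other nodes with 
its local average $x_j^{avg}$ and then portions out its 
tasks among the other nodes according to the amounts they 
are below the local average, that is 

$$p_{ij} = \frac{\mathrm{sat}\left(x_j^{avg} - x_i(t - \tau_{ji})\right)}{\sum_{k \neq j} \mathrm{sat}\left(x_j^{avg} - x_k(t - \tau_{jk})\right)} \qquad (12)$$

where sat(y) = y for y ≥ 0 and sat(y) = 0 otherwise. 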
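
The portioning rule, in which a node hands out its tasks in proportion to how far the other nodes sit below its local average, can be sketched as below. The function name, the dictionary input, and the handling of the case where no node is below the average are assumptions for illustration.

```python
from typing import Dict


def transfer_fractions(local_avg: float, observed: Dict[int, float]) -> Dict[int, float]:
    """Split outgoing tasks among the other nodes in proportion to how far
    each one's (delayed) observed load lies below the sender's local average."""
    # Nodes at or above the average have no deficit and receive nothing.
    deficits = {node: max(local_avg - load, 0.0) for node, load in observed.items()}
    total = sum(deficits.values())
    if total == 0.0:
        # Assumed convention: nobody is below the average, so nothing is sent.
        return {node: 0.0 for node in observed}
    return {node: deficit / total for node, deficit in deficits.items()}
```

Whenever at least one node is below the average, the returned fractions sum to one, so the sender's outgoing tasks are fully distributed and no task is created or lost.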

The model was shown to be consistent in that the total 
number of tasks is conserved and the queues are always 
nonnegative, and the system was shown to be always 
stable. The comparative study of the papers is tabulated 
in Table I. 

Table I. Comparison Of Above Load Balancing Techniques 

| Parameters | "On Delay Adjustment for Dynamic Load Balancing in Distributed Virtual Environments" | "Communication-Aware Load Balancing for Parallel Applications on Clusters" | "The Effect of Time Delays on the Stability of Load Balancing Algorithms for Parallel Computations" |
|---|---|---|---|
| Environment | Distributed virtual environment | Cluster computation | Cluster computation |
| Technique | The change in the load of the servers due to network delay would affect the performance of the load balancing algorithm. | The scheme is designed to reduce the performance gap between the effective speed of CPU and network resources. A behavioural model is introduced to capture the resource requirements; COM-aware uses this model for effective bandwidth utilization. | The effects of delays in the exchange of information among CEs, and the constraints these effects impose on the design of a load balancing strategy. |
| Points to be focussed | Load dynamics is considered. Whether the existing load or the newly added load is to be transferred is not considered. | Load dynamics is considered. The selection process of the node to transfer the load is to be considered. | Load dynamics is not considered. The selection process of the node to transfer the load is to be considered. |

D. Proposed Work 

The architecture of the proposed system will 
consist of a server and a number of clients, where 
clients run interactive applications whose load will 
be continuously varying. At the server side, highest priority 
will be given to the interactive applications. 
The proposed simple design model is shown in Fig. 1. 



[Figure: block diagram with the labelled components Applications, Adaptive Scheduler, Priority Queue, and Processing] 

Fig. 1. Proposed Design 
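
The priority rule of the proposed design, serving interactive applications first, can be sketched with a binary heap. The class name, the two request kinds, and the first-come-first-served tie-breaking are assumptions for illustration; the paper does not specify the scheduler's internals.

```python
import heapq
from itertools import count
from typing import List, Optional, Tuple

INTERACTIVE, BATCH = 0, 1  # lower value is served first


class PriorityScheduler:
    """Minimal priority queue: interactive requests always leave the queue
    before batch requests; ties are served in arrival order."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._arrival = count()  # monotonically increasing tie-breaker

    def submit(self, name: str, kind: int) -> None:
        heapq.heappush(self._heap, (kind, next(self._arrival), name))

    def next_request(self) -> Optional[str]:
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

Submitting a batch request before an interactive one still results in the interactive request being processed first, which keeps its queuing delay low under load.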

Conclusions 

Due to communication delays among servers, the load 
balancing process may be using outdated load information 
to conduct load migration. This would significantly affect 
the performance of the load balancing algorithm. We need to 
reduce the latency for interactive applications such that it is 
in line with the degree of load change. From the literature survey, 
we find that software overhead has a higher impact on network 
latency than the delay of the network itself. Hence we have 
proposed an algorithm in which the queuing delay is 
minimized by giving the highest priority to interactive 
applications. 